AITopics | Osage County

Collaborating Authors

Osage County

LLMs as Method Actors: A Model for Prompt Engineering and Architecture

arXiv.org Artificial IntelligenceNov-11-2024

We introduce "Method Actors" as a mental model for guiding LLM prompt engineering and prompt architecture. Under this mental model, LLMs should be thought of as actors; prompts as scripts and cues; and LLM responses as performances. We apply this mental model to the task of improving LLM performance at playing Connections, a New York Times word puzzle game that prior research identified as a challenging benchmark for evaluating LLM reasoning. Our experiments with GPT-4o show that a "Method Actors" approach can significantly improve LLM performance over both a vanilla and "Chain of Thoughts" approach. A vanilla approach solves 27% of Connections puzzles in our dataset and a "Chain of Thoughts" approach solves 41% of puzzles, whereas our strongest "Method Actor" approach solves 86% of puzzles. We also test OpenAI's newest model designed specifically for complex reasoning tasks, o1-preview. When asked to solve a puzzle all at once, o1-preview solves 79% of Connections puzzles in our dataset, and when allowed to build puzzle solutions one guess at a time over multiple API calls, o1-preview solves 100% of the puzzles. Incorporating a "Method Actor" prompt architecture increases the percentage of puzzles that o1-preview solves perfectly from 76% to 87%.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2411.05778

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New York (0.04)
Europe > France (0.04)
(2 more...)

Genre:

Research Report (1.00)
Workflow (0.71)

Industry:

Leisure & Entertainment > Games (1.00)
Media (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

On Functional Activations in Deep Neural Networks

Nencka, Andrew S., Muftuler, L. Tugan, LaViolette, Peter, Koch, Kevin M.

arXiv.org Artificial IntelligenceNov-17-2023

Background: Deep neural networks have proven to be powerful computational tools for modeling, prediction, and generation. However, the workings of these models have generally been opaque. Recent work has shown that the performance of some models are modulated by overlapping functional networks of connections within the models. Here the techniques of functional neuroimaging are applied to an exemplary large language model to probe its functional structure. Methods: A series of block-designed task-based prompt sequences were generated to probe the Facebook Galactica-125M model. Tasks included prompts relating to political science, medical imaging, paleontology, archeology, pathology, and random strings presented in an off/on/off pattern with prompts about other random topics. For the generation of each output token, all layer output values were saved to create an effective time series. General linear models were fit to the data to identify layer output values which were active with the tasks. Results: Distinct, overlapping networks were identified with each task. Most overlap was observed between medical imaging and pathology networks. These networks were repeatable across repeated performance of related tasks, and correspondence of identified functional networks and activation in tasks not used to define the functional networks was shown to accurately identify the presented task. Conclusion: The techniques of functional neuroimaging can be applied to deep neural networks as a means to probe their workings. Identified functional networks hold the potential for use in model alignment, modulation of model output, and identifying weights to target in fine-tuning.

artificial intelligence, functional network, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2311.10898

Country: North America > United States > Oklahoma > Osage County (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Hierarchical Approach to exploiting Multiple Datasets from TalkBank

Wong, Man Ho

arXiv.org Artificial IntelligenceJun-21-2023

TalkBank is an online database that facilitates the sharing of linguistics research data. However, the existing TalkBank's API has limited data filtering and batch processing capabilities. To overcome these limitations, this paper introduces a pipeline framework that employs a hierarchical search approach, enabling efficient complex data selection. This approach involves a quick preliminary screening of relevant corpora that a researcher may need, and then perform an in-depth search for target data based on specific criteria. The identified files are then indexed, providing easier access for future analysis. Furthermore, the paper demonstrates how data from different studies curated with the framework can be integrated by standardizing and cleaning metadata, allowing researchers to extract insights from a large, integrated dataset. While being designed for TalkBank, the framework can also be adapted to process data from other open-science platforms.

artificial intelligence, data mining, dataset, (17 more...)

arXiv.org Artificial Intelligence

2306.12596

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Oklahoma > Osage County (0.04)
North America > United States > New Jersey > Bergen County > Mahwah (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.34)

Technology:

Information Technology > Data Science > Data Mining (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.31)
Information Technology > Artificial Intelligence > Cognitive Science (0.30)

Add feedback

To ChatGPT, or not to ChatGPT: That is the question!

Pegoraro, Alessandro, Kumari, Kavita, Fereidooni, Hossein, Sadeghi, Ahmad-Reza

arXiv.org Artificial IntelligenceApr-5-2023

ChatGPT has become a global sensation. As ChatGPT and other Large Language Models (LLMs) emerge, concerns of misusing them in various ways increase, such as disseminating fake news, plagiarism, manipulating public opinion, cheating, and fraud. Hence, distinguishing AI-generated from human-generated becomes increasingly essential. Researchers have proposed various detection methodologies, ranging from basic binary classifiers to more complex deep-learning models. Some detection techniques rely on statistical characteristics or syntactic patterns, while others incorporate semantic or contextual information to improve accuracy. The primary objective of this study is to provide a comprehensive and contemporary assessment of the most recent techniques in ChatGPT detection. Additionally, we evaluated other AI-generated text detection tools that do not specifically claim to detect ChatGPT-generated content to assess their performance in detecting ChatGPT-generated content. For our evaluation, we have curated a benchmark dataset consisting of prompts from ChatGPT and humans, including diverse questions from medical, open Q&A, and finance domains and user-generated responses from popular social networking platforms. The dataset serves as a reference to assess the performance of various techniques in detecting ChatGPT-generated content. Our evaluation results demonstrate that none of the existing methods can effectively detect ChatGPT-generated content.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2304.01487

Country:

North America > United States > Oklahoma > Osage County (0.04)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.94)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Like a Shem that brought Golem to life

#artificialintelligenceAug-4-2021, 10:40:12 GMT

A chatbot is a great way to make internet communication more pleasant for both customers and companies. At the beginning of the millennium, people would probably laugh at you for this sentence. The main reason for this change is Natural Language Processing (NLP). It is this branch of artificial intelligence science, that has transformed the clumsy and cumbersome automata into today's clever chatbots, which you can hardly tell from people sometimes. Thanks to NLP, artificial intelligence learns to understand something as complex as human communication.

artificial intelligence, chatbot, natural language, (13 more...)

#artificialintelligence

Country: North America > United States > Oklahoma > Osage County (0.05)

Technology: Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.72)

Add feedback

Patch Reordering: a Novel Way to Achieve Rotation and Translation Invariance in Convolutional Neural Networks

Shen, Xu, Tian, Xinmei, Sun, Shaoyan, Tao, Dacheng

arXiv.org Machine LearningNov-28-2019

Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance on many visual recognition tasks. However, the combination of convolution and pooling operations only shows invariance to small local location changes in meaningful objects in input. Sometimes, such networks are trained using data augmentation to encode this invariance into the parameters, which restricts the capacity of the model to learn the content of these objects. A more efficient use of the parameter budget is to encode rotation or translation invariance into the model architecture, which relieves the model from the need to learn them. To enable the model to focus on learning the content of objects other than their locations, we propose to conduct patch ranking of the feature maps before feeding them into the next layer. When patch ranking is combined with convolution and pooling operations, we obtain consistent representations despite the location of meaningful objects in input. We show that the patch ranking module improves the performance of the CNN on many benchmark tasks, including MNIST digit recognition, large-scale image recognition, and image retrieval. The code is available at https://github.com//jasonustc/caffe-multigpu/tree/TICNN .

artificial intelligence, feature map, machine learning, (17 more...)

arXiv.org Machine Learning

1911.12682

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.05)
Oceania > Australia (0.04)
North America > United States > Oklahoma > Osage County (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback